{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "52da442f",
   "metadata": {},
   "source": [
    "The following shows how to use WordNet. First, import WordNet from NLTK under the short alias `wn`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "id": "b65972e1",
   "metadata": {},
   "outputs": [],
   "source": [
    "# If this cell does not run, uncomment the following two lines\n",
    "# import nltk\n",
    "# nltk.download('omw-1.4')\n",
    "\n",
    "from nltk.corpus import wordnet as wn"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "1c0fce14",
   "metadata": {},
   "source": [
    "Let us see which senses “cat” has (each sense corresponds to a different synset):"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "id": "8aaf527a",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[Synset('cat.n.01'), Synset('guy.n.01'), Synset('cat.n.03'), Synset('kat.n.01'), Synset('cat-o'-nine-tails.n.01'), Synset('caterpillar.n.02'), Synset('big_cat.n.01'), Synset('computerized_tomography.n.01'), Synset('cat.v.01'), Synset('vomit.v.01')]\n"
     ]
    }
   ],
   "source": [
    "print(wn.synsets('cat'))"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "856d8bc7",
   "metadata": {},
   "source": [
    "Next, let us look at the definition of “cat.n.01”, that is, the first synset of cat as a noun:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "id": "9b75fa0f",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "feline mammal usually having thick soft fur and no ability to roar: domestic cats; wildcats\n"
     ]
    }
   ],
   "source": [
    "print(wn.synset('cat.n.01').definition())"
   ]
  },
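  {
   "cell_type": "markdown",
   "id": "e3a1b7c4",
   "metadata": {},
   "source": [
    "A synset name such as “cat.n.01” packs three fields: a head lemma, a part-of-speech tag, and a two-digit sense number. As a minimal sketch, the cell below splits such a name with plain string operations. `POS_NAMES` and `parse_synset_name` are hypothetical helpers written for this illustration and are not part of NLTK. Splitting from the right keeps any dots inside the lemma intact."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "f4b2c8d5",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Hypothetical helpers for illustration only; they are not NLTK functions.\n",
    "POS_NAMES = {'n': 'noun', 'v': 'verb', 'a': 'adjective',\n",
    "             's': 'adjective satellite', 'r': 'adverb'}\n",
    "\n",
    "\n",
    "def parse_synset_name(name):\n",
    "    # Split from the right so that dots inside the lemma are preserved.\n",
    "    lemma, pos, sense = name.rsplit('.', 2)\n",
    "    return lemma, POS_NAMES[pos], int(sense)\n",
    "\n",
    "\n",
    "for name in ['cat.n.01', 'vomit.v.01', 'computerized_tomography.n.01']:\n",
    "    print(name, '->', parse_synset_name(name))"
   ]
  },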
  {
   "cell_type": "code",
   "execution_count": 4,
   "id": "aa67513e",
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "[nltk_data] Downloading package omw-1.4 to D:\\nltk_data...\n",
      "[nltk_data]   Package omw-1.4 is already up-to-date!\n",
      "[nltk_data] Downloading package wordnet to D:\\nltk_data...\n",
      "[nltk_data]   Package wordnet is already up-to-date!\n"
     ]
    },
    {
     "data": {
      "text/plain": [
       "True"
      ]
     },
     "execution_count": 4,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "import nltk\n",
    "\n",
    "# Download the omw-1.4 and WordNet datasets\n",
    "nltk.download('omw-1.4')\n",
    "nltk.download('wordnet')"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "4f8c9702",
   "metadata": {},
   "source": [
    "The lemmas of “cat.n.01”, along with its lemma names in another language:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "id": "b5f82d34",
   "metadata": {
    "scrolled": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[Lemma('cat.n.01.cat'), Lemma('cat.n.01.true_cat')]\n",
      "['にゃんにゃん', 'キャット', 'ネコ', '猫']\n"
     ]
    }
   ],
   "source": [
    "print(wn.synset('cat.n.01').lemmas())\n",
    "# Chinese is not supported for now, so Japanese lemma names are shown\n",
    "print(wn.synset('cat.n.01').lemma_names('jpn'))\n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "b4ba2cbf",
   "metadata": {},
   "source": [
    "最后，我们看一下“cat.n.01”的上位词和下位词："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "id": "c8f621fc",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[Synset('feline.n.01')]\n",
      "[Synset('domestic_cat.n.01'), Synset('wildcat.n.03')]\n"
     ]
    }
   ],
   "source": [
    "# Hypernyms\n",
    "print(wn.synset('cat.n.01').hypernyms())\n",
    "# Hyponyms\n",
    "print(wn.synset('cat.n.01').hyponyms())"
   ]
  },
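  {
   "cell_type": "markdown",
   "id": "a7d9e2f1",
   "metadata": {},
   "source": [
    "Following hypernym links repeatedly leads from a synset up to the root of the noun hierarchy. As a sketch, the cell below prints each such chain for “cat.n.01” with `hypernym_paths()`, which returns paths ordered from the root down to the synset. It assumes the WordNet data has already been downloaded; the variable names `cat_synset` and `paths` are introduced here for illustration."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "b8e0f3a2",
   "metadata": {},
   "outputs": [],
   "source": [
    "from nltk.corpus import wordnet as wn\n",
    "\n",
    "cat_synset = wn.synset('cat.n.01')\n",
    "# Each path is a list of synsets ordered from the root to cat.n.01.\n",
    "paths = cat_synset.hypernym_paths()\n",
    "for path in paths:\n",
    "    print(' -> '.join(s.name() for s in path))"
   ]
  },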
  {
   "cell_type": "markdown",
   "id": "03428711",
   "metadata": {},
   "source": [
    "The following compares the path similarity between “boy.n.01”, “girl.n.01”, “cat.n.01”, and “dog.n.01”:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "id": "1768d060",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "boy and girl 0.16666666666666666\n",
      "boy and cat 0.08333333333333333\n",
      "girl and cat 0.07692307692307693\n",
      "boy and dog 0.14285714285714285\n",
      "girl and dog 0.125\n",
      "cat and dog 0.2\n"
     ]
    }
   ],
   "source": [
    "boy = wn.synset('boy.n.01')\n",
    "girl = wn.synset('girl.n.01')\n",
    "cat = wn.synset('cat.n.01')\n",
    "dog = wn.synset('dog.n.01')\n",
    "\n",
    "# Path similarity for each pair of synsets\n",
    "print(\"boy and girl\", boy.path_similarity(girl))\n",
    "print(\"boy and cat\", boy.path_similarity(cat))\n",
    "print(\"girl and cat\", girl.path_similarity(cat))\n",
    "print(\"boy and dog\", boy.path_similarity(dog))\n",
    "print(\"girl and dog\", girl.path_similarity(dog))\n",
    "print(\"cat and dog\", cat.path_similarity(dog))"
   ]
  }
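  ,
  {
   "cell_type": "markdown",
   "id": "c9f1a4b3",
   "metadata": {},
   "source": [
    "`path_similarity` scores two synsets as 1 / (d + 1), where d is the number of edges on the shortest path connecting them through hypernym and hyponym links. As a rough sketch of that idea, the cell below recomputes the score on a small hand-written taxonomy fragment. The dictionary `TOY_HYPERNYMS` and the functions `ancestors_with_depth` and `toy_path_similarity` are assumptions made for this illustration and are not NLTK APIs. In the fragment, cat and dog are four edges apart, which gives 1 / 5 = 0.2, the same value as the cat and dog score printed above."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "d0a2b5c4",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Hand-written toy fragment of a hypernym hierarchy (an assumption for\n",
    "# illustration; the real WordNet graph is far larger).\n",
    "TOY_HYPERNYMS = {\n",
    "    'cat': ['feline'],\n",
    "    'dog': ['canine'],\n",
    "    'feline': ['carnivore'],\n",
    "    'canine': ['carnivore'],\n",
    "    'carnivore': [],\n",
    "}\n",
    "\n",
    "\n",
    "def ancestors_with_depth(node):\n",
    "    # Breadth-first walk upward: map each ancestor to its edge distance.\n",
    "    distances = {node: 0}\n",
    "    frontier = [node]\n",
    "    while frontier:\n",
    "        next_frontier = []\n",
    "        for current in frontier:\n",
    "            for parent in TOY_HYPERNYMS[current]:\n",
    "                if parent not in distances:\n",
    "                    distances[parent] = distances[current] + 1\n",
    "                    next_frontier.append(parent)\n",
    "        frontier = next_frontier\n",
    "    return distances\n",
    "\n",
    "\n",
    "def toy_path_similarity(a, b):\n",
    "    # The shortest path climbs from a to a common ancestor, then descends to b.\n",
    "    up_a = ancestors_with_depth(a)\n",
    "    up_b = ancestors_with_depth(b)\n",
    "    common = set(up_a) & set(up_b)\n",
    "    if not common:\n",
    "        return None\n",
    "    shortest = min(up_a[n] + up_b[n] for n in common)\n",
    "    return 1 / (shortest + 1)\n",
    "\n",
    "\n",
    "print(toy_path_similarity('cat', 'dog'))\n",
    "print(toy_path_similarity('cat', 'feline'))"
   ]
  }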
 ],
 "metadata": {
  "kernelspec": {
   "display_name": ".venv",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.12.5"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}
